THE DUNIA PIPELINE

250 GiB of packages per day

Official + test maps

Over 3.7 million lines of code

2.5 M C++ runtime

6 studios, 450 people in Montreal

Getting data every day

~500 check-ins per day

350 data, 150 code (+ sound)
DATA FLOW IN THE DUNIA PIPELINE

- 1 Perforce instance for artists, to keep the history of source asset files

- 1 other Perforce instance, into which they export assets in a platform-agnostic format, using plugins we provide

- Design data is also in this second Perforce instance (world description, entities, properties, ...)

- The editor can run with a copy of this data

- Assets are JIT "compiled" in order to be used; the transformed assets are kept on disk
- To build a package for console:

- Extraction pass from the editor

- Asset compilation pass (+ dependency following)

- Compression + packaging

- Extraction can be long: 5 to 15 min per square kilometer, so it is distributed overnight

The entire process is known as "binarization"
AUTOMATED TESTS IN THE DUNIA PIPELINE




Binarization is tested with each submitted CL, alongside other regression tests 





FC3 WORST CASE SCENARIO

Full editor rebuild on Far Cry 3: up to 40 min

First editor load: up to 15 min, because of JIT compiling

First binarization: up to 45 min, because of the extraction pass + first compilation
To run on console, there is less code: 25 min for a full rebuild

+ copy of the latest nightly from the network (2 to 15 min depending on network load)

Conclusion: optimization is mandatory for FC4, with 2 new platforms to support




The Pipeline
OPTIMIZING

Optimizing Compilation Time
Optimizing Nightly Builds
Optimizing Package Synchronization
Optimizing Local Change Testing






From 40 min to 4 min

COMPILATION TIME: FASTBUILD
HISTORY

■ Editor DLL: ~40 min

3.7 million LOC

■ Unity/Blob builds: ~20 min

Bad iteration

■ Engine Architect pet project

FASTBuild by Franta Fulin

■ Good state: all-in!

Early tests show potential wins
PARALLELIZATION

When compiling dependent DLLs, only the link steps depend on each other. Compilation can start as soon as possible
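
The effect of link-only dependencies can be sketched with a small simulation. This is a toy model, not FASTBuild's scheduler: the project names and durations are invented, and it assumes there are always enough cores to run every ready step.

```python
# Toy model: two DLLs, where "editor" depends on "engine".
# All names and durations (in seconds) are invented for illustration.
PROJECTS = {
    "engine": {"compile": 60, "link": 10, "deps": []},
    "editor": {"compile": 50, "link": 10, "deps": ["engine"]},
}

def finish_times(projects, project_level_deps):
    """Return the time at which each project's link step finishes.

    project_level_deps=True models a project-by-project build: a project
    cannot start compiling until its dependencies are fully linked.
    project_level_deps=False models link-only dependencies: compilation
    starts at t=0 and only the link step waits for dependencies.
    Assumes enough cores to run every ready step at once.
    """
    done = {}

    def finish(name):
        if name not in done:
            p = projects[name]
            deps_done = max((finish(d) for d in p["deps"]), default=0)
            if project_level_deps:
                link_start = deps_done + p["compile"]
            else:
                link_start = max(p["compile"], deps_done)
            done[name] = link_start + p["link"]
        return done[name]

    for name in projects:
        finish(name)
    return done

before = finish_times(PROJECTS, project_level_deps=True)
after = finish_times(PROJECTS, project_level_deps=False)
print(before["editor"], after["editor"])  # 130 80
```

In this model the dependent DLL finishes at 80 s instead of 130 s, because its compilation overlaps with the compilation of the DLL it depends on.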



[Figure: build timelines before and after, MSBuild vs FASTBuild]
■ Compiling: static libs

MSBuild also suffers from bad scheduling for static libs, where many projects can be started at the same time




■ Visual Studio: 

■ Context Switching 

■ File Cache eviction 

■ 32 Core Machine: 

■ >1000 Processes! 
■ FASTBuild:

■ NUM PROCESSORS

Utilization is improved. What next? Eliminate redundant compilation: caching
CACHING

Object files are stored in a central cache, then linked locally

Other users can retrieve the object files directly from the cache, then link locally
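
A minimal sketch of such a cache follows. The slides do not say how cache entries are keyed, so the key used here (a hash of the compiler arguments plus the preprocessed source) is an assumption, and `ObjectCache`, `build_object` and `fake_compile` are hypothetical names. A temporary folder stands in for the central cache.

```python
import hashlib
import tempfile
from pathlib import Path

class ObjectCache:
    """Content-addressed store for object files (a plain folder here)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    @staticmethod
    def key(preprocessed_source: bytes, compiler_args: str) -> str:
        # Assumption: the key must cover everything that influences the output.
        h = hashlib.sha256()
        h.update(compiler_args.encode("utf-8"))
        h.update(b"\0")
        h.update(preprocessed_source)
        return h.hexdigest()

    def get(self, key):
        path = self.root / key
        return path.read_bytes() if path.exists() else None

    def put(self, key, obj: bytes):
        (self.root / key).write_bytes(obj)

def build_object(cache, source: bytes, args: str, compile_fn):
    """Return (object_bytes, was_cache_hit); compile only on a miss."""
    k = ObjectCache.key(source, args)
    obj = cache.get(k)
    if obj is not None:
        return obj, True
    obj = compile_fn(source, args)
    cache.put(k, obj)
    return obj, False

def fake_compile(source, args):
    # Stand-in for the real compiler.
    return b"OBJ:" + source

cache = ObjectCache(tempfile.mkdtemp())
first = build_object(cache, b"int main(){}", "/O2", fake_compile)
second = build_object(cache, b"int main(){}", "/O2", fake_compile)
third = build_object(cache, b"int main(){}", "/Od", fake_compile)
print(first[1], second[1], third[1])  # False True False
```

The second request is served from the cache; the third one misses because the compiler arguments differ. Linking still happens locally in both cases.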
[Figure: build timelines before and after, MSBuild vs FASTBuild with cache]
BLOBBING

■ Reducing header compilation

Blobbing greatly reduces compilation time by reducing header compilation



Problem: when you iterate, the entire blob needs recompilation

Solution: extract the edited file from the blob, while keeping the other blobs stable
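
One way to keep the other blobs stable is sketched below. It assumes blob membership is derived from the full sorted file list, which is an illustration and not necessarily how FASTBuild assigns files; `make_blobs` and the file names are hypothetical.

```python
def make_blobs(files, blob_size, edited=()):
    """Group source files into unity blobs, isolating edited files.

    Blob membership is decided from the full sorted file list, so pulling an
    edited file out of its blob never reshuffles the other blobs.
    Returns (blobs, isolated): blobs maps a blob name to its list of files,
    isolated lists the files that are compiled on their own.
    """
    ordered = sorted(files)
    edited = set(edited)
    blobs = {}
    for i in range(0, len(ordered), blob_size):
        members = [f for f in ordered[i:i + blob_size] if f not in edited]
        if members:
            blobs[f"blob_{i // blob_size}.cpp"] = members
    isolated = [f for f in ordered if f in edited]
    return blobs, isolated

files = ["ai.cpp", "audio.cpp", "camera.cpp", "fire.cpp", "vehicle.cpp", "water.cpp"]
clean, _ = make_blobs(files, blob_size=3)
dirty, isolated = make_blobs(files, blob_size=3, edited=["camera.cpp"])

print(isolated)                                    # ['camera.cpp']
print(dirty["blob_0.cpp"])                         # ['ai.cpp', 'audio.cpp']
print(dirty["blob_1.cpp"] == clean["blob_1.cpp"])  # True
```

The blob that contained the edited file is rebuilt once without it; after that, each iteration only recompiles the isolated file.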
Cut compilation into 2 separate steps:

- Preprocessing, then compilation

- Preprocessed files have no file dependencies, and can be sent for remote compilation
MISCELLANEOUS


■ Link Dependencies 

Prevent useless relink 


■ Batching 

Build machine, pre-submit checks 
REAL TIMINGS

■ Editor-X64-Release DLL (before)

■ Editor-X64-Release DLL BLOB (before)

~20 mins

■ Incredibuild

~12 mins

■ FASTBuild

~10 mins (local PC!)

<4 mins (cache/distribution)

PS4: <1 min (cache/distribution)





EDIT & CONTINUE

■ Too much code: takes minutes

+ Microsoft dropping support

■ DLL with hot reload

Another story: lots of great content available online

■ Incremental linking

Where it actually helps

We now use hot-reloadable DLLs instead of Edit and Continue



4 Times Faster

NIGHTLY BUILDS
Improvements

[Figure: nightly build comparison, Far Cry 3 vs Far Cry 4. Legible labels: 1.5 hours, 75 worlds, PlayStation 3, Xbox 360, Xbox One]
HOW?



PROFILING 

Understand, optimize 



Profiling 

■ Multi-Process 

Process isolation 


■ Multi-Machine 

Cluster of Workers 

■ Interactions 

Working together 

■ Big Picture 

Understand 




Assets get compiled in process isolation: it is hard to see correctly how the main process and the worker processes interact
Remote Profiling

Remote profiling:

- Sub-processes report profiling data over the network

- Data gets committed to the main profiling buffer
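
The reporting flow can be sketched as follows. The message format (one JSON line per event) and the function names are assumptions made for the illustration, and a local socket pair stands in for a real network connection between the worker processes and the main process.

```python
import json
import socket

def report(sock, process_name, events):
    """Worker side: send profiling events as one JSON line per event."""
    for name, start, end in events:
        msg = {"process": process_name, "name": name, "start": start, "end": end}
        sock.sendall((json.dumps(msg) + "\n").encode("utf-8"))

def commit(sock, buffer):
    """Main side: read events until the worker closes, append to the main buffer."""
    with sock.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            buffer.append(json.loads(line))

main_buffer = [{"process": "main", "name": "schedule", "start": 0.0, "end": 0.2}]

# One socket pair per worker stands in for a real network connection.
workers = {
    "worker-1": [("compile texture", 0.2, 1.4)],
    "worker-2": [("compile mesh", 0.2, 0.9), ("compile sound", 0.9, 1.1)],
}
for worker, events in workers.items():
    tx, rx = socket.socketpair()
    report(tx, worker, events)
    tx.close()
    commit(rx, main_buffer)
    rx.close()

# Each process becomes one track, similar to how a profiler shows threads.
tracks = {}
for event in sorted(main_buffer, key=lambda e: e["start"]):
    tracks.setdefault(event["process"], []).append(event["name"])
print(tracks)
```

Once every event sits in a single buffer, the main process and its workers can be inspected on one timeline.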


■ On Top of Regular Tool 

Custom Performance Analyzer 


■ Processes Shown as Threads 

Analyze Interactions 


■ Fix! 

Address Inefficiencies 




Remove useless steps
Remove unnecessary sync points
Better scheduling
Improvements: CPU Usage
Resource Caching

Transformed assets are stored in a central cache, so they can be retrieved by other users


■ Share 

Prevent useless work 

■ Build Machine 

Populate at night 

■ Editor 

JIT Compilation (Just In Time) 



The editor benefits from the cache, since it JIT compiles assets
NIGHTLIES



Complex System 

Key: Understanding 



Prevent useless work 

Hardware is crucial 
FUTURE WORK



Prevent Editor Usage 

Current Bottleneck 



More Incremental 

Some parts are not 
From 20 min to 1 sec

OPTIMIZE PACKAGE SYNC.
SUMMER

FAR CRY 3 + ASSASSIN'S CREED III

2 major AAA games shipping at the same time
It killed our network
NETWORK

OPTIMIZE PACKAGE SYNC.

SOFTWARE: RTPAL

HARDWARE: ASSET STORE

The problem needed to be solved at the project level (software), as well as at the studio level (hardware)
Software

SYNCH. PACKAGE: RTPAL

We will get to what exactly RTPal is. First, let's get back to summer 2012...
PC BUILD

■ Differential uploads

■ Sliced ISO

■ ~50% saving

[Figure: upload through SteamPipe]

...shipping our PC build, using "SteamPipe" from Valve

We have an advantage over Steam: we know our data


WE KNOW OUR DATA 


■ Big Files 

File boundaries 


■ Redundancy 

Same Resource, Several Worlds 


■ Amount of new Data ? 

Out of 20GB package 
SOLUTION

Package SLICING into parts

Manifest excerpt (one file entry, followed by its parts):

Name="common.dat" Hash="12390181022458953888" Size="146445768"
CreationTime="130360306344898459" LastWriteTime="130674356947830393"
LastAccessTime="130674356947830393" Attributes="1"

Begin="0" Size="49516" MD5="02ae4cbce570fc9cdc8d656c19776036"
Begin="49516" Size="4285" MD5="6aa518d432ae603ec8e49497facaa5c4"
Begin="53801" Size="7691" MD5="ef848a0c71b0dc64f62c838e22b2cb95"
Begin="61492" Size="3118" MD5="042ec74c70d3b5cb11b984553f27e19f"
Begin="64610" Size="7691" MD5="ef848a0c71b0dc64f62c838e22b2cb95"
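
The manifest excerpt lists parts by Begin, Size and MD5, and the same MD5 appears twice in it. A minimal sketch of slicing and reuse follows, assuming one part per file (the slides only say that slicing follows file boundaries); the function names and test data are invented.

```python
import hashlib

def slice_package(files):
    """Slice a package at file boundaries.

    `files` is a list of (name, content) pairs laid out back to back in the
    package. Returns one part per file as a dict with the same three fields
    as the manifest excerpt: Begin, Size and MD5.
    """
    parts, begin = [], 0
    for _name, content in files:
        parts.append({
            "Begin": begin,
            "Size": len(content),
            "MD5": hashlib.md5(content).hexdigest(),
        })
        begin += len(content)
    return parts

def bytes_to_download(manifest, known_hashes):
    """Total size of the parts whose hash is not already available locally."""
    seen = set(known_hashes)
    total = 0
    for part in manifest:
        if part["MD5"] not in seen:
            total += part["Size"]
            seen.add(part["MD5"])  # a part repeated inside the package is fetched once
    return total

rock, tree, hut = b"r" * 500, b"t" * 300, b"h" * 200
monday = slice_package([("rock", rock), ("tree", tree), ("rock_copy", rock)])
tuesday = slice_package([("rock", rock), ("tree", tree), ("hut", hut), ("rock_copy", rock)])

have = {p["MD5"] for p in monday}
print(bytes_to_download(monday, set()))  # 800
print(bytes_to_download(tuesday, have))  # 200
```

The first package is 1300 bytes but only 800 need to be transferred, because one part is repeated inside it. The next version only costs the 200 bytes of the new part.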
RESULTS

Reuse: 95%

■ 1 GB new per 22 GB package, between 2 versions

■ 40% shrink within a package
History

■ FC3: a few weeks + milestones

■ FC4: >10 000 manifests

1 YEAR

Full prod. proof
THIS IS GOOD, BUT...

■ Regular workflow...

Gym / test map

■ 20% of the package

20% of 5% = 1%


■ Linking... 

20 GB to write 


■ Deploy... 

zzzzz 
ON DEMAND

■ STREAM parts on demand 

■ OPEN WORLD game 

■ NETWORK is faster than crappy 




Streaming the parts on demand is not a problem, as we are making an open world 
game that already streams its content asynchronously 
VIRTUALIZE File Access

FILE SYSTEM

■ Regular

■ MANIFEST Content

Streaming parts requires in-game code changes: this is done by virtualizing file accesses
Several implementations are then possible:

- Regular disk

- Virtualized manifest content

- Network file system

- ...etc

File systems can be combined: this is the file system stack
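
A minimal sketch of such a stack follows. The interface (`read`, `mount`, `unmount`), the class names, and the lookup order (most recently mounted first) are assumptions for the illustration, not the engine's actual API.

```python
class DictFileSystem:
    """Stand-in for any backend: regular disk, manifest content, network..."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)

    def read(self, path):
        return self.files.get(path)  # None means "not found here"

class FileSystemStack:
    """Tries each mounted file system in order, most recently mounted first."""

    def __init__(self):
        self.layers = []

    def mount(self, fs):
        self.layers.insert(0, fs)

    def unmount(self, fs):
        self.layers.remove(fs)

    def read(self, path):
        for fs in self.layers:
            data = fs.read(path)
            if data is not None:
                return fs.name, data
        raise FileNotFoundError(path)

stack = FileSystemStack()
network = DictFileSystem("network", {"world.dat": b"official", "sound.dat": b"official"})
disk = DictFileSystem("disk", {"world.dat": b"local"})
stack.mount(network)
stack.mount(disk)

print(stack.read("world.dat"))  # ('disk', b'local')
print(stack.read("sound.dat"))  # ('network', b'official')

stack.unmount(disk)
print(stack.read("world.dat"))  # ('network', b'official')
```

Game code only calls `read` on the stack, so the same code path works whether the data comes from a local disk or from the network.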
FILE SYSTEM STACK

[Figure: file system stack between Far Cry 4 and RTPal, built up over several slides]
The network file system requires a companion app on the PC: RTPal

The workflow described is platform agnostic: it unifies the workflow for all platforms

RTPal:

- Runs its own file system cache

- Retrieves manifests

- Retrieves and delivers parts on demand

- Does part caching

To run a package, select it in RTPal, and set the "rtpal=ip" command line
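
On-demand delivery with part caching can be sketched as below. `PartServer` and `PartCache` are hypothetical names, and a manifest is reduced to a list of part hashes; the real RTPal protocol is not described in the slides.

```python
class PartServer:
    """Stand-in for the remote store that holds every part, keyed by hash."""

    def __init__(self, parts):
        self.parts = parts
        self.downloads = 0

    def fetch(self, part_hash):
        self.downloads += 1
        return self.parts[part_hash]

class PartCache:
    """Local cache: a part is downloaded the first time it is requested."""

    def __init__(self, server):
        self.server = server
        self.local = {}

    def get(self, part_hash):
        if part_hash not in self.local:
            self.local[part_hash] = self.server.fetch(part_hash)
        return self.local[part_hash]

    def progress(self, manifest):
        """Fraction of a package's parts that are already cached."""
        unique = set(manifest)
        return len(unique.intersection(self.local)) / len(unique)

server = PartServer({"h1": b"terrain", "h2": b"trees", "h3": b"ps4 shaders", "h4": b"pc shaders"})
ps4_package = ["h1", "h2", "h3"]  # a manifest reduced to its list of part hashes
pc_package = ["h1", "h2", "h4"]

cache = PartCache(server)
for part_hash in ps4_package:
    cache.get(part_hash)
cache.get("h1")  # served locally, no new download

print(server.downloads)                      # 3
print(cache.progress(ps4_package))           # 1.0
print(round(cache.progress(pc_package), 2))  # 0.67
```

Fetching the parts of one package also advances every other package that references the same parts.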


RTPAL

1st video: sync 5 packages in one click

2nd video: change the PC package just by selecting it

3rd video: downloading some parts demonstrates sharing in action. By getting parts for one package, other packages also progress, since they reference the same parts
RTPAL

STORAGE AND TRANSFER REDUCTION

History, Transfers

UNIFIED WORKFLOW

For all platforms

INSTANT PACKAGE SYNC

1 click
RTPal re-invents package distribution
by getting rid of package distribution
Hardware

SYNCH. PACKAGE: ASSET STORE

NEEDS

HIGH PERFORMANCE

IOPS/Throughput

QUICKLY SCALABLE, HIGH CAPACITY

Adapt to workload

ROBUST

Failure tolerance
SOLUTIONS

- Ceph: not good enough for IOPS in 2012

- Hadoop HDFS: centralized server

- OpenAFS: not mature enough

- => Red Hat implementation of GlusterFS
GLUSTER FS


■ Distributed file system 

Cluster of PCs 


■ Replication between nodes 

Up to 3 copies of a file 

■ 2 Network interfaces 

One for internal data exchange 
■ 8 x high-end PCs

HP Z420 (32 GB RAM, 6 HT CPUs at 3.2 GHz)

■ 8 x high-end SSDs

On a SATA3 RAID 0 controller card

■ 2 x network interfaces

10 Gb/s

RAID controller = LSI MegaRAID 9260-8i (PCIe 2.0, 8 lanes)
GLUSTER FS - 8 Nodes

            Throughput                IOPS
SAN/EMC     18 MiB/s/user (13 min)    250 000
GLUSTER     73 MiB/s/user (3 min)     544 000

SAN/EMC: SAN EMC VNX on NAS Windows Server
GLUSTER - COST

[Figure: cost (k$) and speed (MiB/s) of each solution]

SAN Violin: 6600 series on NAS Windows Server
SAN EMC: VNX on NAS Windows Server

Violin (and Gluster) could probably go faster, but were limited by our testing infrastructure
USAGE SCHEMES


■ RTPal 

Package distribution 


■ Resources caching 

Transformed assets 


■ Code compilation cache 

FASTBuild 
FUTURE WORK



Re-Evaluate CEPH 

Improved since 2012 



Improve Healing Process 

If still with Gluster 
From a Day to Minutes

LOCAL ITERATION: DEVPATCHER
SUMMER

FAR CRY 3

Fixes: nightmare

While shipping FC3, there were 3 solutions for testing small changes:

- Change, submit, wait until the next day for the nightly: not a very good workflow...

- Change, binarize locally: too long...

- Change, open the bigfile by hand, locate and replace the file by hand, save, deploy

The 3rd workflow was manual, but actually efficient. Could we automate it?
WHAT IF


■ Detect CHANGES 

■ Handles COMPILATION 


■ Handles BIG FILE creation 


Introducing 

Mil 

as a Workflow 





CREATE PATCH 


6 

STAGES 



The "DevPatcher" is the tool that automates all those steps 
DETECTING CHANGES

INPUTS -> COMPILE DEP.

Understand what all the real inputs that define an output are

Write all of those inputs down into a description file (CRC of files, code version, parameters, ...etc)

=> This description file is called a "Compile Dep"
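
A compile dep could look like the sketch below. The slides only list what it records (CRC of files, code version, parameters), so the JSON layout, the field names and the sample asset are all invented for the illustration.

```python
import json
import zlib

def make_compile_dep(output, input_files, code_version, parameters):
    """Describe everything that defines `output`.

    `input_files` maps a path to the file's bytes. The field names are
    invented for this sketch.
    """
    return {
        "output": output,
        "inputs": {path: zlib.crc32(data) for path, data in sorted(input_files.items())},
        "code_version": code_version,
        "parameters": parameters,
    }

dep = make_compile_dep(
    output="hut.mesh.bin",
    input_files={"hut.mesh": b"vertices...", "wood.material": b"albedo..."},
    code_version="mesh_compiler_12",
    parameters={"platform": "ps4", "lod_count": 3},
)
text = json.dumps(dep, indent=2, sort_keys=True)
print(text)
```

The description is tiny compared to the compiled asset, so it is cheap to store and to transfer.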
PACKAGING CHANGES

NIGHTLY BUILD

All "Compile Deps" get packaged alongside the actual data with each nightly

To know whether an asset has been changed locally compared to the official build, don't get the official build: just get the compile deps

- Examine all inputs, and compare them to the local disk

- If there is a change, the asset needs to be rebuilt

- All rebuilt assets are packaged together in a patch bigfile
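
These three steps can be sketched as follows. The compile dep layout and the function names are hypothetical, a dictionary stands in for the patch bigfile, and parameter comparison is left out to keep the sketch short.

```python
import zlib

def needs_rebuild(compile_dep, local_files, local_code_version):
    """True if any recorded input differs from what is on the local disk."""
    if compile_dep["code_version"] != local_code_version:
        return True
    for path, recorded_crc in compile_dep["inputs"].items():
        data = local_files.get(path)
        if data is None or zlib.crc32(data) != recorded_crc:
            return True
    return False

def build_patch(compile_deps, local_files, local_code_version, compile_fn):
    """Rebuild only the changed assets and gather them into one patch."""
    patch = {}
    for dep in compile_deps:
        if needs_rebuild(dep, local_files, local_code_version):
            patch[dep["output"]] = compile_fn(dep, local_files)
    return patch

# Compile deps as they would come with the nightly (format is hypothetical).
nightly_deps = [
    {"output": "hut.bin", "code_version": "v12", "inputs": {"hut.mesh": zlib.crc32(b"old hut")}},
    {"output": "tree.bin", "code_version": "v12", "inputs": {"tree.mesh": zlib.crc32(b"tree")}},
]
local_disk = {"hut.mesh": b"new hut", "tree.mesh": b"tree"}  # hut.mesh was edited locally

def fake_compile(dep, files):
    # Stand-in for the real asset compiler.
    return b"|".join(files[p] for p in dep["inputs"])

patch = build_patch(nightly_deps, local_disk, "v12", fake_compile)
print(sorted(patch))  # ['hut.bin']
```

Only the asset whose input changed is rebuilt, and the official package itself is never read during detection.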
APPLYING CHANGES

- The patch bigfile gets mounted with precedence over the regular package

- Get rid of (or just unmount) the patch bigfile to get back to the official build
DEVPATCHER & RTPAL

[Figure: RTPal, compile deps and the file system stack]

- Thanks to RTPal, nothing gets downloaded

- Just the compile deps -> patch bigfile

- If an asset is in the patch bigfile, read it from there; otherwise the file system stack downloads it

=> No need to download an asset to patch it: INSANE!
DUNIA

The Pipeline

FASTBuild
Fast Nightly Builds
RTPal & Asset Store
DevPatcher


THANKS & CREDITS 



■ Franta Fulin / FASTBuild (http://fastbuild.org/) 

■ Jean-Francois Cyr / DevPatcher 

■ Olivier Deschamps / DevPatcher 

■ Laurent Chouinard / AssetStore 

■ Jonhatan Chin / AssetStore 

■ Jocelyn Hotte / AssetStore 
Remi QUENIN

@azagoth
remi.quenin@ubisoft.com 




